GeneID in Drosophila.

Authors

  • G Parra
  • E Blanco
  • R Guigó
Abstract

GeneID is a program, designed with a hierarchical structure, that predicts genes in anonymous genomic sequences. In the first step, splice sites and start and stop codons are predicted and scored along the sequence using position weight matrices (PWMs). In the second step, exons are built from these sites. Each exon is scored as the sum of the scores of its defining sites, plus the log-likelihood ratio of a Markov model for coding DNA. In the last step, the gene structure is assembled from the set of predicted exons so as to maximize the sum of the scores of the assembled exons. In this paper we describe how the PWMs for sites and the Markov model of coding DNA were obtained for Drosophila melanogaster. We also compare other models of coding DNA with the Markov model. Finally, we present and discuss the results obtained when GeneID is used to predict genes in the Adh region. These results show that the accuracy of GeneID predictions is currently comparable to that of other existing tools, but that GeneID is likely to be more efficient in terms of speed and memory usage.
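
The three-step hierarchy described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not GeneID's actual implementation: the function names, the data layouts (a PWM as a list of per-position dictionaries, Markov models as nested dictionaries of transition probabilities), and the simplified assembly rule (any chain of non-overlapping exons, ignoring reading frame and exon type) are all hypothetical choices made for the example.

```python
import math

def pwm_score(pwm, site):
    """Step 1: score a candidate site as the sum of per-position PWM entries.

    pwm is assumed to be a list of dicts mapping base -> log-odds score.
    """
    return sum(column[base] for column, base in zip(pwm, site))

def markov_llr(seq, coding, noncoding, order=1):
    """Log-likelihood ratio of a coding vs. a non-coding Markov chain.

    coding and noncoding are assumed to map context -> {base: probability}.
    """
    total = 0.0
    for i in range(order, len(seq)):
        context, base = seq[i - order:i], seq[i]
        total += math.log(coding[context][base] / noncoding[context][base])
    return total

def exon_score(left_site_score, right_site_score, coding_llr):
    """Step 2: exon score = scores of the two defining sites + coding LLR."""
    return left_site_score + right_site_score + coding_llr

def assemble(exons):
    """Step 3 (simplified): pick the chain of non-overlapping exons with the
    highest summed score. exons is a list of (start, end, score) tuples with
    inclusive coordinates. This quadratic version is for illustration only.
    """
    exons = sorted(exons, key=lambda e: e[1])
    best = []  # best[i] = (total score, chain) for the best chain ending at exons[i]
    for i, (start, end, score) in enumerate(exons):
        total, chain = score, [exons[i]]
        for j in range(i):
            # exons[j] may precede exons[i] only if it ends before exons[i] starts
            if exons[j][1] < start and best[j][0] + score > total:
                total, chain = best[j][0] + score, best[j][1] + [exons[i]]
        best.append((total, chain))
    return max(best, key=lambda b: b[0]) if best else (0.0, [])
```

For instance, given three candidate exons `(1, 10, 2.0)`, `(5, 20, 3.0)` and `(11, 30, 2.5)`, the sketch selects the first and third, because together they outscore the overlapping middle exon on its own.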

Journal:
  • Genome Research

Volume 10, Issue 4

Pages: -

Publication date: 2000